library(tidyverse)
knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)
Young Soo Choi
August 17, 2022
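Today’s challenge is to:
1. read in a data set, and describe the data set using both words and any supporting information (e.g., tables)
2. identify what needs to be done to tidy the current data
3. anticipate the shape of pivoted data
4. pivot the data into tidy format using pivot_longer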
Read the data regarding eggs
# A tibble: 120 × 6
month year large_half_dozen large_dozen extra_large_half_dozen extra_l…¹
<chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 January 2004 126 230 132 230
2 February 2004 128. 226. 134. 230
3 March 2004 131 225 137 230
4 April 2004 131 225 137 234.
5 May 2004 131 225 137 236
6 June 2004 134. 231. 137 241
7 July 2004 134. 234. 137 241
8 August 2004 134. 234. 137 241
9 September 2004 130. 234. 136. 241
10 October 2004 128. 234. 136. 241
# … with 110 more rows, and abbreviated variable name ¹extra_large_dozen
# ℹ Use `print(n = ...)` to see more rows
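This dataset shows monthly egg prices, starting in January 2004, categorized by egg size (large or extra large) and quantity (half dozen or dozen). With the prices spread across four columns, the table is hard to read at a glance, so I will pivot it. The four columns for the egg types will be pivoted into a single “types” column, with their values in a “price” column.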
In the ‘eggs’ dataset, two variables (month and year) identify a case, and the other four columns will be pivoted. The expected result therefore has 120 × 4 = 480 rows and 2 + 2 = 4 columns.
# A tibble: 480 × 4
month year types price
<chr> <dbl> <chr> <dbl>
1 January 2004 large_half_dozen 126
2 January 2004 large_dozen 230
3 January 2004 extra_large_half_dozen 132
4 January 2004 extra_large_dozen 230
5 February 2004 large_half_dozen 128.
6 February 2004 large_dozen 226.
7 February 2004 extra_large_half_dozen 134.
8 February 2004 extra_large_dozen 230
9 March 2004 large_half_dozen 131
10 March 2004 large_dozen 225
# … with 470 more rows
# ℹ Use `print(n = ...)` to see more rows
# A tibble: 480 × 4
year month types price
<dbl> <chr> <chr> <dbl>
1 2004 January large_half_dozen 126
2 2004 January large_dozen 230
3 2004 January extra_large_half_dozen 132
4 2004 January extra_large_dozen 230
5 2004 February large_half_dozen 128.
6 2004 February large_dozen 226.
7 2004 February extra_large_half_dozen 134.
8 2004 February extra_large_dozen 230
9 2004 March large_half_dozen 131
10 2004 March large_dozen 225
# … with 470 more rows
# ℹ Use `print(n = ...)` to see more rows
Any additional comments?
After pivoting, the dataset has 480 rows and 4 columns, as expected.
---
title: "Challenge 3"
author: "Young Soo Choi"
description: "Tidy Data: Pivoting"
date: "08/17/2022"
format:
  html:
    toc: true
    code-fold: true
    code-copy: true
    code-tools: true
categories:
- challenge_3
---
```{r}
#| label: setup
#| warning: false
#| message: false
library(tidyverse)
knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)
```
## Challenge Overview
Today's challenge is to:
1. read in a data set, and describe the data set using both words and any supporting information (e.g., tables)
2. identify what needs to be done to tidy the current data
3. anticipate the shape of pivoted data
4. pivot the data into tidy format using `pivot_longer`
## Read in data
Read the data regarding eggs
```{r}
eggs <- read_csv("_data/eggs_tidy.csv")
eggs
```
### Briefly describe the data
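This dataset shows monthly egg prices, starting in January 2004, categorized by egg size (large or extra large) and quantity (half dozen or dozen). With the prices spread across four columns, the table is hard to read at a glance, so I will pivot it. The four columns for the egg types will be pivoted into a single "types" column, with their values in a "price" column.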
## Anticipate the End Result
In the 'eggs' dataset, two variables (month and year) identify a case, and the other four columns will be pivoted. The expected result therefore has 120 × 4 = 480 rows and 2 + 2 = 4 columns.
```{r}
# existing rows (cases)
nrow(eggs)
# existing columns
ncol(eggs)
# expected rows after pivoting: one row per case per price column
nrow(eggs) * (ncol(eggs) - 2)
# expected columns: 2 identifiers (month, year) + types + price
2 + 2
```
## Pivot the Data
```{r}
# pivot the four price columns into a "types" column and a "price" column
pivot_eggs <- pivot_longer(
  eggs,
  cols = c(large_half_dozen, large_dozen, extra_large_half_dozen, extra_large_dozen),
  names_to = "types",
  values_to = "price"
)
pivot_eggs
# change the order of columns so that year comes before month
pivot_eggs <- pivot_eggs[c(2, 1, 3, 4)]
pivot_eggs
```
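The four price columns are listed by name in the pivot above. As a hedged sketch of two alternatives, the same columns could also be selected by excluding the identifier columns or by their shared "dozen" suffix. The one-row tibble `eggs_sample` is an illustrative stand-in built from the January 2004 row of the printed data, not an object from the analysis itself.

```{r}
library(tidyr)
library(tibble)

# Illustrative one-row sample (values taken from the January 2004 row).
eggs_sample <- tibble(
  month = "January",
  year = 2004,
  large_half_dozen = 126,
  large_dozen = 230,
  extra_large_half_dozen = 132,
  extra_large_dozen = 230
)

# 1. List the price columns explicitly.
by_name <- pivot_longer(
  eggs_sample,
  cols = c(large_half_dozen, large_dozen, extra_large_half_dozen, extra_large_dozen),
  names_to = "types",
  values_to = "price"
)

# 2. Pivot everything except the identifier columns.
by_exclusion <- pivot_longer(
  eggs_sample,
  cols = -c(month, year),
  names_to = "types",
  values_to = "price"
)

# 3. Pivot every column whose name ends in "dozen".
by_suffix <- pivot_longer(
  eggs_sample,
  cols = ends_with("dozen"),
  names_to = "types",
  values_to = "price"
)

# Compare the alternatives against the explicit version.
identical(by_name, by_exclusion)
identical(by_name, by_suffix)
```

Selecting by exclusion or by suffix keeps the call short, but it relies on the column names following a pattern, so listing the columns explicitly remains the most transparent choice for a small table like this one.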
Any additional comments?
After pivoting, the dataset has 480 rows and 4 columns, as expected.
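The comparison between the anticipated and actual shape can also be made explicit in code. The following is a minimal sketch, assuming a two-row sample (`eggs_sample`, an illustrative name) copied from the January and March 2004 rows of the printed data. It computes the expected dimensions from the number of identifier columns and stops with an error if the pivoted result does not match.

```{r}
library(tidyr)
library(tibble)

# Illustrative two-row sample (January and March 2004 from the printed data).
eggs_sample <- tibble(
  month = c("January", "March"),
  year = c(2004, 2004),
  large_half_dozen = c(126, 131),
  large_dozen = c(230, 225),
  extra_large_half_dozen = c(132, 137),
  extra_large_dozen = c(230, 230)
)

# Columns that identify a case and are therefore not pivoted.
id_cols <- c("month", "year")

# Expected shape: one row per case per pivoted column, and one column
# per identifier plus the names column and the values column.
expected_rows <- nrow(eggs_sample) * (ncol(eggs_sample) - length(id_cols))
expected_cols <- length(id_cols) + 2

long_sample <- pivot_longer(
  eggs_sample,
  cols = !all_of(id_cols),
  names_to = "types",
  values_to = "price"
)

# Raise an error if the pivoted shape differs from the anticipated one.
stopifnot(
  nrow(long_sample) == expected_rows,
  ncol(long_sample) == expected_cols
)
dim(long_sample)
```

Applied to the full `eggs` data, the same rule gives 120 × 4 = 480 rows and 2 + 2 = 4 columns.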